MetrologyNamespace (initial PR) #9

Open
wants to merge 12 commits into
base: main
from

Conversation

andrewgsavage
Contributor

transferred from quantity-dev/quantity-api#5

@lucascolley lucascolley changed the title from "namespace" to "MetrologyNamespace (initial PR)" Mar 17, 2025
def asdimension(obj: str | D) -> D: ...

@staticmethod
def asunit(obj) -> U[D]: ...
Contributor Author

Suggested change
- def asunit(obj) -> U[D]: ...
+ def asunit(obj: str | U) -> U[D]: ...

should this be type hinted?

Member

perhaps #11 to consider

@lucascolley lucascolley linked an issue Mar 27, 2025 that may be closed by this pull request

@nstarman nstarman left a comment

Looks really good!
Minor comments about the docstring formatting.

Comment on lines 11 to 16
VT = TypeVar('VT')
DT = TypeVar('DT', bound='Dimension')
UT = TypeVar('UT', bound='Unit[DT]')

@runtime_checkable
class MetrologyNamespace[Q: Quantity[VT, UT, DT], V, U: Unit[DT], D: Dimension](Protocol):
Member

I tried introducing some TypeVars, but basedmypy isn't happy:

src/metrology_apis/__init__.py:16:38: error: Type variable "metrology_apis.VT" is unbound  [valid-type]
    class MetrologyNamespace[Q: Quantity[VT, UT, DT], V, U: Unit[DT], D: Dimension](Protocol):
                                         ^
src/metrology_apis/__init__.py:16:38: note: (Hint: Use "Generic[VT]" or "Protocol[VT]" base class to bind "VT" inside a class)
src/metrology_apis/__init__.py:16:38: note: (Hint: Use "VT" in function signature to bind "VT" inside a function)

I don't understand what the hints are suggesting though. Any clue @jorenham ?


@jorenham jorenham left a comment

I notice that there are several type parameters with generic upper bounds (basically everything that uses TypeVar). As I mentioned previously, this is unfortunately not supported in Python.

I'd recommend adding type-checkers to the CI, so that you can avoid this in the future.

@lucascolley
Member

@jorenham yes, we have them, you can observe in CI the errors I pointed out at #9 (comment). My question is how to address those errors.

@lucascolley
Member

In other words, do we have to use just Quantity[Any, Any, Any], or is there a better solution? The hints from basedmypy seem to be suggesting that there is some sort of alternative.

@jorenham

jorenham commented Apr 1, 2025

> @jorenham yes, we have them, you can observe in CI the errors I pointed out at #9 (comment). My question is how to address those errors.

Ah nice. I must've missed it because I did not expect type-checkers to run under the "lint" label, but it's of course true that type-checking is a form of linting.

> In other words, do we have to use just Quantity[Any, Any, Any], or is there a better solution? The hints from basedmypy seem to be suggesting that there is some sort of alternative.

Basedmypy is indeed the only one that supports it. But it requires that you use all of the type-parameters in the upper bound on the right-hand side. Put differently, it doesn't accept free type-parameters. That's because free type-parameters don't do anything, so you might as well leave them out (and use Any instead).

However, by relying on this exclusive basedmypy feature, you're effectively requiring all of your users to only use basedmypy. Because, after all, the other type-checkers don't support these parametrized upper bounds of type-parameters.
So if you want to also support e.g. vanilla mypy, vanilla pyright, or basedpyright, then you shouldn't use this feature. I suppose you could consider it backwards-incompatible, in that sense.

The typing-spec compliant alternative is precisely what you said: using Any as type-argument for all generic upper bounds. Depending on the situation you could, of course, also choose something narrower than Any. But I'm afraid that you can't do much better than that; even optype isn't able to massage this reality.
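
A minimal sketch of the spec-compliant form described here, assuming a simplified two-parameter Quantity. It is written with explicit TypeVar declarations rather than the PR's PEP 695 syntax so that it also runs on interpreters older than 3.12. The member names (name, dimension, value, unit) and the Toy* classes are illustrative assumptions, not the PR's actual protocol.

```python
from dataclasses import dataclass
from typing import Any, Protocol, TypeVar, runtime_checkable

V_co = TypeVar("V_co", covariant=True)
D_co = TypeVar("D_co", bound="Dimension", covariant=True)
# Generic upper bound: Any is used as the type argument instead of another
# type variable, i.e. bound="Unit[Any]" rather than bound="Unit[D_co]".
U_co = TypeVar("U_co", bound="Unit[Any]", covariant=True)


@runtime_checkable
class Dimension(Protocol):
    name: str  # hypothetical member, for illustration only


@runtime_checkable
class Unit(Protocol[D_co]):
    @property
    def dimension(self) -> D_co: ...


@runtime_checkable
class Quantity(Protocol[V_co, U_co]):
    @property
    def value(self) -> V_co: ...
    @property
    def unit(self) -> U_co: ...


# Toy implementations that satisfy the protocols structurally.
@dataclass(frozen=True)
class ToyDimension:
    name: str


@dataclass(frozen=True)
class ToyUnit:
    symbol: str
    dimension: ToyDimension


@dataclass(frozen=True)
class ToyQuantity:
    value: float
    unit: ToyUnit


q = ToyQuantity(2.0, ToyUnit("m", ToyDimension("length")))
print(isinstance(q, Quantity), isinstance(q.unit, Unit))
```

The runtime isinstance checks only test that the members exist; the bounds themselves are checked statically by the type checker.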

@lucascolley
Member

> The typing-spec compliant alternative is precisely what you said: using Any as type-argument for all generic upper bounds. Depending on the situation you could, of course, also choose something narrower than Any. But I'm afraid that you can't do much better than that; even optype isn't able to massage this reality.

Okay, thanks! How does b70a68b look?


@jorenham jorenham left a comment

maybe I'm missing something, but I see only two type parameters here:

class Quantity[V, U: Unit](Protocol):

Nitpick: Since you're using Quantity[Any, Any(, Any)?] a couple of times now, you could consider defining a type _AnyQuantity = ... for the sake of DRY.

@lucascolley
Member

I think you're looking at a different commit to the tip of this branch

@lucascolley
Member

okay I pushed some aliases. Looks like that has introduced 1 Mypy failure locally

@lucascolley lucascolley requested a review from jorenham April 1, 2025 16:16

@nstarman nstarman left a comment

This LGTM. Who else / which libraries need to approve?
Astropy: @mhvk @nstarman
Pint:
Unyt:


@mhvk mhvk left a comment

Overall, this looks nice, and the case for __metrology_namespace__ becomes quite clear.

What I'm confused about is the addition of the dimension to a Quantity. Why is that there? Especially if the dimension is something that follows directly from the unit, not a QuantityType. But even in the latter case, is that needed for a minimal API?

But perhaps I'm just misunderstanding how typing works; I also don't really understand why unit is changed to Unit[D: Dimension] - does that at all imply a __class_getitem__ that takes dimension?

@lucascolley
Member

Having Quantity parametrised by Dimension means that we can guarantee that Quantity[..., AstropyDimension].__metrology_namespace__.asdimension returns a (subtype of) AstropyDimension. Because of the lack of higher kinded typing, (I am at least under the impression that) there is no way to do this by only parametrising Quantity with a Unit, even though Unit is itself parametrised by a Dimension.

We are definitely into territory where it's helpful to have these things confirmed by those more knowledgeable with typing, though.

@lucascolley
Member

> I also don't really understand why unit is changed to Unit[D: Dimension]

  1. similarly to Quantity, we want a unit's __metrology_namespace__ to return a matching Dimension, not just any old Dimension
  2. the same holds for Unit.dimension, we are able to guarantee that a Unit[AstropyDimension].dimension returns a (subtype of) AstropyDimension
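
A minimal sketch of the guarantee described in the two points above, assuming a simplified Unit protocol with a single dimension member. ToyDimension and ToyUnit are hypothetical stand-ins for a library's own classes, and the explicit TypeVar form is used instead of the PR's PEP 695 syntax so the snippet runs on older interpreters.

```python
from dataclasses import dataclass
from typing import Protocol, TypeVar

D = TypeVar("D")
D_co = TypeVar("D_co", covariant=True)


class Unit(Protocol[D_co]):
    @property
    def dimension(self) -> D_co: ...


@dataclass(frozen=True)
class ToyDimension:  # hypothetical library-specific dimension type
    name: str


@dataclass(frozen=True)
class ToyUnit:  # plain, non-generic class; matched structurally
    symbol: str
    dimension: ToyDimension


def dimension_of(unit: Unit[D]) -> D:
    # Given a ToyUnit, a type checker solves D = ToyDimension, so the
    # return type is ToyDimension rather than "any old Dimension".
    return unit.dimension


metre = ToyUnit("m", ToyDimension("length"))
dim = dimension_of(metre)
print(dim.name)
```

The subscription Unit[D] exists on the protocol for annotation purposes only. ToyUnit never inherits from Unit and does not need to define __class_getitem__ to be accepted as a Unit[ToyDimension].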

Return MetrologyNamespace in docstrings

@jorenham jorenham left a comment

Adding the dimension type parameters seems like a good choice to me.

From a user perspective, it can become quite verbose if you always have to specify three required type-arguments for Quantity. Two ways to help with that come to mind:

  1. Make them optional by setting PEP 696 type-parameter defaults. If you use Any as default in Quantity, for example, then the QT type alias won't be needed anymore. Implementation-wise, this requires using typing_extensions.TypeVar, for compatibility with python <3.13, which isn't all that pretty unfortunately.
  2. Re-parametrize Quantity using the entire MetrologyNamespace instead. That way, all three type-parameters can be statically known by matching on self: Quantity[MetrologyNamespace[...]] in methods. You're basically moving the problem here, though. Type-checker errors will also become more verbose this way. The additional advantage of this is that you now also know the exact (sub)type of the MetrologyNamespace itself, including any additional (e.g. user-defined) methods and attributes.

@lucascolley
Member

> for compatibility with python <3.13, which isn't all that pretty unfortunately.

we're actually requiring Python 3.13 right now for ease of development (requires-python = ">= 3.13"). We can always look at loosening that in the future if things become ready to use before we would want to drop older versions.

@jorenham

jorenham commented Apr 2, 2025

> > for compatibility with python <3.13, which isn't all that pretty unfortunately.
>
> we're actually requiring Python 3.13 right now for ease of development (requires-python = ">= 3.13"). We can always look at loosening that in the future if things become ready to use before we would want to drop older versions.

Excellent. Then defaults are basically free 👌

@lucascolley
Member

is there clear documentation for the syntax for defaults in Python 3.13?

def asunit(obj: str | U) -> U: ...

@staticmethod
def asquantity(obj: Q | V, *, unit: U) -> Q: ...
Contributor Author

could the unit be entered as a string?

Member

it could, but I don't think we should require that? At worst you would just need to do mn.asquantity(v, unit=mn.asunit('length'))


@nstarman nstarman Apr 2, 2025

Agreed. 2 thoughts:

  1. The QAPI, like the AAPI, represents the minimum support required for an implementing library, so even if we only have unit: U implementing libraries can support strings.
  2. Strings as inputs seem reasonable, but also something we can discuss later. Unit objects as inputs is definitely a must.
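
A minimal sketch of the pattern discussed in this thread: asunit accepts strings or unit objects, while asquantity only promises support for unit objects, so strings go through asunit first. ToyNamespace, ToyUnit, and ToyQuantity are hypothetical, and the string "parsing" is a placeholder that simply wraps the symbol.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToyUnit:
    symbol: str


@dataclass(frozen=True)
class ToyQuantity:
    value: float
    unit: ToyUnit


class ToyNamespace:
    """Hypothetical stand-in for an implementing library's namespace."""

    @staticmethod
    def asunit(obj: str | ToyUnit) -> ToyUnit:
        # Unit objects pass through unchanged; strings are wrapped as-is
        # (a real library would parse them).
        return obj if isinstance(obj, ToyUnit) else ToyUnit(obj)

    @staticmethod
    def asquantity(obj: float, *, unit: ToyUnit) -> ToyQuantity:
        # The minimal signature only requires unit objects; a library is
        # free to accept strings as an extension, this toy does not.
        if not isinstance(unit, ToyUnit):
            raise TypeError("unit must be a unit object; call asunit() first")
        return ToyQuantity(obj, unit)


mn = ToyNamespace
q = mn.asquantity(1.5, unit=mn.asunit("m"))
print(q)
```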

@jorenham

jorenham commented Apr 2, 2025

> is there clear documentation for the syntax for defaults in Python 3.13?

Yea, in PEP 696 for example

@@ -32,7 +110,34 @@ def __rtruediv__(self, other: Self, /) -> Self: ...


@runtime_checkable
- class Quantity[V, U: Unit](Protocol):
+ class Quantity[V, U: UT, D: Dimension](Protocol):
Member

like this @jorenham ?

Suggested change
- class Quantity[V, U: UT, D: Dimension](Protocol):
+ class Quantity[V = Any, U = Any, D = Any](Protocol):

@mhvk

mhvk commented Apr 2, 2025

Bit larger question about asquantity(obj: V|Q, unit: U): what exactly is done for Q and V here? For V, presumably just attaches the unit. But for Q, I assume the idea is to do unit conversion?

I ask in part since this means one is combining two separate actions into one function, which I'm not totally sure is a good idea. This partially since it just immediately implies that the very first line in any implementation will be if isinstance(obj, Quantity), which seems bad form. But also from a logical point of view: in most other interactions with quantities, any V will be treated as dimensionless. In quantity creation, one explicitly attaches a unit, so it is fine to behave differently for V, but why then not just attach a new unit to a Q? My overall sense would be not to overload, i.e., asquantity should just take a value and a unit.

Now of course we do need a way to convert quantities, and/or more generally convert some V from one unit to another, which maybe should be included here. In principle, that requires one or both of the following:

def convert(obj: V, from_unit: U, to_unit: U) -> V:

which would effectively be part of the units API (i.e., independent of Quantity), i.e., to use it for a quantity would be to write asquantity(convert(q.value, q.unit, new_unit), new_unit). Or we can have something that takes a Quantity,

def convert(obj: Q, unit: U) -> Q:

This second version could be used to convert a value via convert(asquantity(value, from_unit), to_unit).value. (Note that in analogy with the array API, asunit might be a more logical name here, but I like asunit for unit creation...).

I'm not sure whether one should just have both (with different names, obviously), just have one, or combine, with from_unit taken from the input if it is a Quantity.

@lucascolley
Member

I also feel that separation may be preferable. It does depart from the array API standard in the sense that xp.asarray accepts arrays, but this isn't obviously a bad thing to me. One can always write a convenience function such as the following when wanting to interpret bare values as dimensionless quantities:

def interpret_as_quantity(obj: Quantity[V, U] | V) -> Quantity[V, U]:
    return obj if isinstance(obj, Quantity[V, U]) else mn.asquantity(obj)
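
A runnable variant of the convenience function above, under two assumptions that differ from the snippet. First, the isinstance check uses the bare runtime-checkable protocol, because a subscripted generic such as Quantity[V, U] raises TypeError when passed to isinstance. Second, a hypothetical DIMENSIONLESS unit is passed explicitly, since asquantity as proposed takes a unit. All Toy* names and the string-based unit are illustrative only.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Quantity(Protocol):
    @property
    def value(self) -> Any: ...
    @property
    def unit(self) -> Any: ...


@dataclass(frozen=True)
class ToyQuantity:
    value: float
    unit: str


DIMENSIONLESS = ""  # hypothetical dimensionless unit


def asquantity(value: float, unit: str) -> ToyQuantity:
    return ToyQuantity(value, unit)


def interpret_as_quantity(obj: Quantity | float) -> Quantity:
    # Quantities pass through untouched; bare values become dimensionless.
    return obj if isinstance(obj, Quantity) else asquantity(obj, DIMENSIONLESS)


length = ToyQuantity(1.0, "m")
print(interpret_as_quantity(length), interpret_as_quantity(2.0))
```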

@@ -32,7 +110,34 @@ def __rtruediv__(self, other: Self, /) -> Self: ...


@runtime_checkable
- class Quantity[V, U: Unit](Protocol):
+ class Quantity[V, U: UT, D: Dimension](Protocol):

you can keep the upper bounds:

Suggested change
- class Quantity[V, U: UT, D: Dimension](Protocol):
+ class Quantity[V = Any, U: UT = Any, D: Dimension = Any](Protocol):



@runtime_checkable
class MetrologyNamespace[Q: QT, V: VT, U: UT, D: Dimension](Protocol):

Suggested change
- class MetrologyNamespace[Q: QT, V: VT, U: UT, D: Dimension](Protocol):
+ class MetrologyNamespace[Q: QT = Any, V: VT = Any, U: UT = Any, D: Dimension = Any](Protocol):

or you could keep e.g. Q and V required:

Suggested change
- class MetrologyNamespace[Q: QT, V: VT, U: UT, D: Dimension](Protocol):
+ class MetrologyNamespace[Q: QT, V: VT, U: UT = Any, D: Dimension = Any](Protocol):

@nstarman

nstarman commented Apr 2, 2025

Ok. Rich set of comments from @mhvk and @lucascolley! Here's my 2¢, building off the comments

From @mhvk

> Now of course we do need a way to convert quantities, and/or more generally convert some V from one unit to another, which maybe should be included here. In principle, that requires one or both of the following:
>
> def convert(obj: V, from_unit: U, to_unit: U) -> V:
>
> which would effectively be part of the units API (i.e., independent of Quantity), i.e., to use it for a quantity would be to write asquantity(convert(q.value, q.unit, new_unit), new_unit). Or we can have something that takes a Quantity,
>
> def convert(obj: Q, unit: U) -> Q:

  1. I propose the function name uconvert for doing the unit conversion of quantities. We can't have name collisions, so using convert directly should be avoided (except for illustrative purposes, as was the intent in the comment 😄).

def uconvert(obj: Q[U], unit: U) -> Q[U]: ...

  2. I'm a big fan of the idea of def convert(obj: V, from_unit: U, to_unit: U) -> V. There isn't currently an exact function like this in any library I'm aware of. Astropy comes close with Unit.to. The major use of this function is that it wraps the actual operation of the unit conversion on the value. All libraries perform this function by some means and I think it's worth making available to the user. I know that it is useful in the context of ML, where I've written functions like this to interface a physics-related function to a unit-free (😢) ML NN. I propose it be named uconvert_value.

def uconvert_value(obj: V, from_unit: U, to_unit: U) -> V:

Then (this is not enforced by the API and would be discussed only in documentation) a very reasonable implementation of uconvert in an implementing library is

def uconvert(obj: Q[U], unit: U) -> Q[U]:
    return asquantity(uconvert_value(obj.value, obj.unit, unit), unit)

Which very nicely separates the concerns of a) converting the value in one unit to another and b) reconstituting the Quantity.
Of course if there are multiple Q classes (like in Astropy), then the logic gets a bit more complicated, but the overall point remains.
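
A minimal sketch of the separation of concerns described above, assuming purely multiplicative units with a single quantity class. ToyUnit (with a hypothetical scale attribute giving the unit's size in a common base unit) and ToyQuantity are illustrative; the function names follow the proposal in this comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyUnit:
    symbol: str
    scale: float  # size of this unit expressed in a common base unit


@dataclass(frozen=True)
class ToyQuantity:
    value: float
    unit: ToyUnit


def asquantity(value: float, unit: ToyUnit) -> ToyQuantity:
    return ToyQuantity(value, unit)


def uconvert_value(value: float, from_unit: ToyUnit, to_unit: ToyUnit) -> float:
    # (a) Convert the bare value; no quantity object is involved.
    return value * from_unit.scale / to_unit.scale


def uconvert(obj: ToyQuantity, unit: ToyUnit) -> ToyQuantity:
    # (b) Reconstitute the quantity around the converted value.
    return asquantity(uconvert_value(obj.value, obj.unit, unit), unit)


m = ToyUnit("m", 1.0)
km = ToyUnit("km", 1000.0)
distance = uconvert(ToyQuantity(2.5, km), m)
print(distance.value, distance.unit.symbol)
```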

From @mhvk

> My overall sense would be not to overload, i.e., asquantity should just take a value and a unit.

From @lucascolley

> I also feel that separation may be preferable. It does depart from the array API standard in the sense that xp.asarray accepts arrays, but this isn't obviously a bad thing to me. One can always write a convenience function such as the following when wanting to interpret bare values as dimensionless quantities:

I see your perspectives and strongly sympathize. But I think I disagree. First, what I do 100% agree on is that

def asquantity(value: V, unit: U) -> Q[V, U]: ...

is the core operation of asquantity, lifting values to quantities.

However, I do think that the similarity to xp.asarray, together with the fact that in basically every library I know the Quantity constructor can accept a quantity, means asquantity should do likewise.
That being said, I think it's AOK to push this off to QAPI v2, whenever that is after we've converged on this in-progress v1.
I just suspect users will communicate with implementing libraries who will then communicate with us that the signature should be

@overload
def asquantity(value: V, unit: U) -> Q[V, U]: ...
@overload
def asquantity(value: Q[V, U], unit: U) -> Q[V, U]: ...

The overlap between asquantity and uconvert is unfortunate, but the uconvert is much more focused and asquantity should use its uconvert method under-the-hood, making asquantity the flexible function that it is, analogous to how flexible xp.asarray is in the AAPI, generally supporting dtype conversion.

For example, something which asquantity might support that uconvert should not is a dtype argument if the implementing library only works with V: Array (uconvert will encounter dtype promotion only through the machinery of the underlying array, so any required int->float promotion does not require an argument to uconvert).

# ExampleLibrary.py (implementing and extending the QAPI)
def asquantity(value: V, unit: U, dtype: DType | None = None) -> Q[V, U]: ...
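
A minimal sketch of the overloaded asquantity proposed in this comment, in which a quantity input is delegated to uconvert and a bare value is simply tagged with the unit. It assumes multiplicative toy units; ToyUnit, ToyQuantity, and the scale attribute are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import overload


@dataclass(frozen=True)
class ToyUnit:
    symbol: str
    scale: float  # size of this unit expressed in a common base unit


@dataclass(frozen=True)
class ToyQuantity:
    value: float
    unit: ToyUnit


def uconvert(obj: ToyQuantity, unit: ToyUnit) -> ToyQuantity:
    return ToyQuantity(obj.value * obj.unit.scale / unit.scale, unit)


@overload
def asquantity(value: float, unit: ToyUnit) -> ToyQuantity: ...
@overload
def asquantity(value: ToyQuantity, unit: ToyUnit) -> ToyQuantity: ...
def asquantity(value, unit):
    if isinstance(value, ToyQuantity):
        # Quantity input: focused unit conversion under the hood.
        return uconvert(value, unit)
    # Bare value: the core operation, lifting a value to a quantity.
    return ToyQuantity(value, unit)


m = ToyUnit("m", 1.0)
km = ToyUnit("km", 1000.0)
print(asquantity(3.0, m), asquantity(ToyQuantity(2.0, km), m))
```

This is exactly the isinstance branch raised as a concern earlier in the thread; the sketch only shows what the overload would look like, not whether it should be adopted.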

@lucascolley
Member

Thanks @nstarman, that sounds pretty good to me! If all existing libraries would be happy with implementing uconvert and uconvert_value then that sounds useful.

@mhvk

mhvk commented Apr 2, 2025

I like the idea but the names not as much 😺

Note that this being on the metrology namespace, it is OK to have a more common name ("Namespaces are one honking great idea" and all that). Maybe convert(from_unit, to_unit, value) and to_unit(q, new_unit)?

Aside, astropy has the concept of equivalencies (treating, e.g., radian as dimensionless, or "arcsec" as equivalent to "parsec" (just do the inverse), or temperature as equivalent to energy (multiply with k_B)). Definitely not something one should insist on, obviously, but something not to preclude either, i.e., we may want to leave **kwargs? Or should the recommendation be to use context managers? (Probably also more for v2 or even v3!)

@phlptp

phlptp commented Apr 2, 2025

Suggested I move this here from #16

List of conversion-related methods from the library I maintain.
Units

  • is_exactly_the_same(Unit)->bool
  • has_same_base(Unit)->bool This is effectively the same as what a dimension would compare as equivalent
  • equivalent_non_counting(Unit)->bool ignore mol|radian|count
  • is_convertible_to(Unit|str)->bool There are some built in assumptions and conversions that will allow conversion where the units are not of the same base. The library has some notions of commodities so can include things like densities and energy densities in a few cases. also things like mass to weight conversions assuming standard gravity.
  • convert(float,Unit|str)->float
  • to same as convert

Measurement|Quantity

  • value_as(Unit|str)->float
  • convert_to(Unit|str)->Quantity
  • to same as convert_to
  • convert_to_base()->Quantity convert the units to base units for a given Quantity, e.g. ft would convert to m
  • as_unit()->Unit returns the quantity as a new unit that can be used in other Quantities.

Standalone methods

  • convert(float, Unit|str, Unit|str)->float
  • convert_pu(float,Unit|str, Unit|str, float)->float includes base value for conversion of per unit values.

@mhvk

mhvk commented Apr 2, 2025

> Suggested I move this here from #16

Thanks! I'll add astropy equivalents, though will restate that we really should try to do the minimum. That said, if every unit library implements something, perhaps that is a sign that it belongs in the absolute minimum...

> List of conversion-related methods from the library I maintain.
>
> Units

* `is_exactly_the_same(Unit)->bool`

Unit.__eq__

* `has_same_base(Unit)->bool`   This is effectively the same as what a dimension would compare as equivalent

Unit.is_equivalent(unit)

* `equivalent_non_counting(Unit)->bool`  ignore mol|radian|count

Unit.is_equivalent(unit, equivalencies=u.dimensionless_angles()) (or other equivalencies). Personally, I think one should not have separate methods for hardcoded assumptions about how to treat units - easier to do with options instead.

* `is_convertible_to(Unit|str)->bool`   There are some built in assumptions and conversions that will allow conversion where the units are not of the same base.  The library has some notions of commodities so can include things like densities and energy densities in a few cases.   also things like mass to weight conversions assuming standard gravity.

Unit.is_equivalent(unit, equivalencies=...), i.e., same as above, but we allow specifying what constitutes convertible units (including functions to possibly do the actual conversion).

* `convert(float,Unit|str)->float`
* `to`   same as convert

Unit.to(unit, equivalencies=...) - we again allow the user to tell how to deal with units that are not normally convertible (e.g., convert flux in wavelength units to one in frequency units).

Measurement|Quantity

* `value_as(Unit|str)->float`

Quantity.to_value(unit[, equivalencies=...])

* `convert_to(Unit|str)->Quantity`
* `to`  same as convert_to

Quantity.to(unit[, equivalencies=...])

* `convert_to_base()->Quantity`  convert the units to a base units for a given Quantity  eg.  `ft` would convert to `m`

Quantity.decompose(unit_system) (SI by default)

* `as_unit()->Unit`   returns the quantity as a new unit that can be used in other Quantities.

Unit(quantity) (to me this seems not a concern of Quantity, but obviously there are two ways to think about this!).

Standalone methods

* `convert(float, Unit|str, Unit|str)->float`

We don't have this but the closest is Unit.to(new_unit, value) (where value can be array)

* `convert_pu(float,Unit|str, Unit|str, float)->float`  includes base value for conversion of per unit values.

Is this for C to F, etc.? My sense would be that such things need a different unit class that knows what its offset is. In astropy, we actually never got around to dealing with this properly (since thankfully almost nobody uses C or F), but we do something similar for magnitudes, decibels and other logarithmic units. It is perhaps relevant, since that is one argument for carrying the conversion method on the unit rather than as a function (maybe better post separately below).
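
A minimal sketch of the idea that a unit class can know its own offset and carry the conversion itself. AffineUnit and its to_base/from_base methods are hypothetical and are not astropy's or any other library's API; kelvin is assumed as the common base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffineUnit:
    symbol: str
    scale: float   # kelvin per one step of this unit
    offset: float  # kelvin value at this unit's zero point

    def to_base(self, value: float) -> float:
        return value * self.scale + self.offset

    def from_base(self, value: float) -> float:
        return (value - self.offset) / self.scale


def convert(value: float, from_unit: AffineUnit, to_unit: AffineUnit) -> float:
    # The free function stays generic; each unit supplies its own mapping.
    return to_unit.from_base(from_unit.to_base(value))


celsius = AffineUnit("degC", 1.0, 273.15)
fahrenheit = AffineUnit("degF", 5.0 / 9.0, 459.67 * 5.0 / 9.0)
boiling_f = convert(100.0, celsius, fahrenheit)
print(round(boiling_f, 6))
```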

@mhvk

mhvk commented Apr 2, 2025

Further to the above, maybe some relevant points (though likely for v2!):

  • A general function like convert(unit1, unit2, value) in the end needs the units to tell how to do the conversion (given, e.g., logarithmic units like decibel). I think that is absolutely fine, but possibly makes the case for a prescribed method on the unit slightly stronger (though, on balance, I think we should leave it as an implementation detail).
  • Missing in our discussion so far is the need for a general function that tells, for a given operation/function, (1) how to convert the input arguments and (2) what the result unit is. I.e., for addition: arg1: no conversion; arg2: convert to unit1; result_unit: unit1. For sin: convert to rad, result dimensionless. Do we need to define an API for this, or is it an implementation detail? One could envision something analogous to result_type in the array API (although here it needs the function, and the order of the arguments matters), say result_unit(function_name: str, units: tuple[U, ...]) -> tuple[result_unit: U, tuple[callable, ...]]
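
A minimal sketch of what such a result_unit function could look like for addition only, assuming multiplicative toy units. The return shape follows the signature floated above; ToyUnit, its scale attribute, and the _converter helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ToyUnit:
    symbol: str
    scale: float  # size of this unit expressed in a common base unit


def _converter(from_unit: ToyUnit, to_unit: ToyUnit) -> Callable[[float], float]:
    factor = from_unit.scale / to_unit.scale
    return lambda value: value * factor


def result_unit(
    function_name: str, units: Tuple[ToyUnit, ...]
) -> Tuple[ToyUnit, Tuple[Callable[[float], float], ...]]:
    if function_name == "add":
        # Order matters: every argument is converted to the first unit,
        # which is also the unit of the result.
        first = units[0]
        return first, tuple(_converter(u, first) for u in units)
    raise NotImplementedError(function_name)


m = ToyUnit("m", 1.0)
km = ToyUnit("km", 1000.0)
unit, converters = result_unit("add", (m, km))
total = sum(conv(v) for conv, v in zip(converters, (5.0, 2.0)))
print(total, unit.symbol)
```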

@phlptp

phlptp commented Apr 2, 2025

> Is this for C to F, etc.? My sense would be that such things need a different unit class that know how what their offset is. In astropy, we actually never got around to dealing with this properly (since thankfully almost nobody uses C or F), but we do something similar for magnitudes, decibels and other logarithmic units. It is perhaps relevant, since that is one argument for carrying the conversion method on the unit rather than as a function (maybe better post separately below).

One of the key requirements for the library was to support power systems units, which look like per-unit MW or per-unit V. These are essentially dimensionless units, but it is necessary to track the original unit as well, so as to know what it can legitimately be converted back to. This allows transformers to be mostly ignored in some computations. But we need a mechanism to get back to the original values as well. Hence the use of a dedicated per-unit flag in the unit definition. I wouldn't really suggest trying to make this mechanism more broadly applicable as it is a rather niche use case, though we do also use it for other things, like strain and other types of ratio measurements.

For C to F and R and a few other temperature units there is a separate chunk of code that knows about those and does the appropriate conversion. There is also an equation flag and overloaded bit codes that know of different types of operations and inverse operations for things like dB and bel, and a few other logarithmic type operations that are inherent in some units. There isn't a general datum shift mechanism though.

@mhvk

mhvk commented Apr 2, 2025

Interesting. That sounds not dissimilar to our logarithmic units, where we also have to keep track of the reference unit (e.g., dBm = Decibel(mW)). It is definitely tricky to get those things right.
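
A minimal sketch of a logarithmic unit that keeps track of its reference unit, in the spirit of dBm = Decibel(mW). It relies only on the standard definition of a power level in decibels, 10 * log10(P / P_ref). DecibelUnit, ToyUnit, and the method names are hypothetical and do not reproduce astropy's implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyUnit:
    symbol: str
    scale: float  # size of this unit in watts


@dataclass(frozen=True)
class DecibelUnit:
    reference: ToyUnit  # the linear unit the level is relative to

    def to_reference(self, level: float) -> float:
        # Level in dB -> linear value expressed in the reference unit.
        return 10.0 ** (level / 10.0)

    def from_reference(self, value: float) -> float:
        # Linear value in the reference unit -> level in dB.
        return 10.0 * math.log10(value)


mW = ToyUnit("mW", 1e-3)
dBm = DecibelUnit(mW)
power_mw = dBm.to_reference(30.0)  # 30 dBm expressed in milliwatts
print(power_mw, dBm.reference.symbol)
```

Because the reference unit is stored on the logarithmic unit, a level can always be mapped back to a linear quantity with a definite unit.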

Development

Successfully merging this pull request may close these issues.

Quantity: constructing quantities
7 participants